This started out as a "why isn't this working?" question, but after a full day of trial and error, I think it might help more people to discuss "what's the best approach in today's world? (FreeBSD 12.2)"
How should disks (or vdevs) be identified when creating ZFS pools in 2021?
(And, implicitly, which conventions are obsolete and should be avoided?)
i.e.

zpool create {pool} [raidz[123]] {what goes here?...}

TL;DR: is the current wisdom still "wrap your ZFS vdevs in labelled GPT partitions", or is there another alternative that I've missed?
What I picked up over the years:
1. Don't bother formatting disks for ZFS; just give ZFS the whole disk.
Rationale: ZFS takes care of everything.

2. Don't use /dev/da[0-9]+.
Rationale: device numbers can change over time (esp. with USB devices etc.), making it difficult to identify which device number refers to which physical disk when things go wrong.
Even so, this is what the FreeBSD Handbook uses (for simplicity?). The examples in section 20.3.6 at least show the "<random number> UNAVAIL ... was /dev/da0" problem, but they don't go into how to identify which physical hardware /dev/da0 refers to.

3. Do use /dev/diskid/<id>.
Rationale: diskids are hardware-derived (typically from the serial number) and thus always remain consistent.
They seemed like an elegant 1:1 mapping between hardware and software.
This no longer seems to be true (see below).
4. Alternatively, use gpart to label a partition on each disk and then use /dev/gpt/<label>.
Rationale: gpart stores your label in the partition table on disk, so these labels are also semi-permanent (i.e. until you re-format the disk).
Also, you can set the label to anything you like, e.g. drive-bay numbers or short, easily identifiable IDs which you can stick on the outside of each drive.
5. Alternatively, you could use /dev/gptid/<id>.
Rationale: they should also be consistent, but I've never used them; they don't appear to relate to anything found on the devices I have, and they are anything but human-friendly. Where do they even come from?

6. Finally, regardless of how you create your pool, ZFS will find the vdevs if they are there.
ZFS apparently scans all devices in /dev/... when looking for pools to import.
This means that it will find and mount your pools no matter what you do, but the name of each vdev may be very different from the name you used when creating the pool.
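For concreteness, option #4 might look like this on a blank disk. This is only a sketch; da0, the label "bay1" and the pool name "tank" are placeholders:

```shell
# Sketch of option #4 (da0, "bay1" and "tank" are placeholder names).
gpart create -s gpt da0                # write a fresh GPT to the disk
gpart add -t freebsd-zfs -l bay1 da0   # one freebsd-zfs partition, labelled "bay1"
zpool create tank /dev/gpt/bay1        # build the pool from the stable label
```

zpool status should then report the vdev as gpt/bay1, regardless of which daN number the disk gets on any given boot.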
I've been using #1 and #3 for the last 10 years and it's been great.
Whenever a disk had a problem, zpool status said "OFFLINE ... was /dev/diskid/<id>" and I knew exactly which drive to clear, online or replace, and all was well. (I labelled each disk with the last few characters of the serial number, which matched the diskid.)
The thing that triggered this excursion for me was that I bought some SSDs the other day to replace my old HDDs...
However, whereas my 10-year-old HDDs each had permanent diskids, the SSDs are showing very strange behaviour.
The new drives are SATA SSDs from two different manufacturers, each encased in an individual USB/SATA enclosure, also from two different manufacturers.
This gives me 4 possible case/SSD combinations, which I spent the last day trying out in order to get stable IDs that I could label the disks with...
What I have discovered with my new SSDs and/or USB enclosures:
- The Delock 42617 SSD cases all(!) show up with the same diskid, namely DISK-000000123DE9 (123?! really?!), so diskids are useless when using these USB enclosures.
=> #3 above is obsolete: don't use /dev/diskid/... any more!
(Or the diskid mechanism needs to be updated to cope with these bizarre devices; whack-a-mole, anyone?)
- Also, the serial numbers reported in /var/log/messages are only sometimes related to the actual serial number on the outside of the disk, and are often quite different (e.g. a completely different prefix, or the last character might be a number instead of a letter). This also seems to be the same, in my view broken, mechanism used by diskinfo (see below).
- Turning OFF diskids by adding kern.geom.label.disk_ident.enable=0 to /boot/loader.conf at least removes the diskid confusion.
- I tried using /dev/da[0-9]+, but was quickly able to produce situations where it was not obvious which physical disk was in trouble.
- The numerical IDs provided by ZFS are of absolutely no help! (e.g. zpool status -g)
- Setting up a GPT partition with a name derived from the serial number seems to be the way to go, but getting the actual serial number was not obvious (to me at least).
- I eventually found:
label=$( camcontrol identify da0 | sed -n 's/.*serial number.*\(.\{4\}\)$/\1/p' )
which works nicely.
- I thought the new SSDs also introduced a problem that corrupted the 'secondary GPT table' during boot, but after much experimentation this seems to be a hardware issue on one of my machines, as I was not able to reproduce this behaviour on my second, identical machine (of course I started on the one that had problems... Murphy... grrr).
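To sanity-check that sed pattern without touching real hardware, you can feed it a fabricated "serial number" line in the style of camcontrol's output (the serial here is made up):

```shell
# Check the sed extraction against a made-up "camcontrol identify"-style line.
sample='serial number          S3Z9NB0K123456X'
label=$( printf '%s\n' "$sample" | sed -n 's/.*serial number.*\(.\{4\}\)$/\1/p' )
# $label now holds the last 4 characters of that line.
echo "$label"
```

The greedy .* means the capture group grabs exactly the last four characters of the line, which is what ends up on the sticker.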
Conclusion
This all seems quite hacky, non-obvious and brittle, just to get stable, consistent device names.
Using GPT labels seems to be the only way to go, but it also feels like individually wrapping bananas in plastic, since ZFS would be quite happy to take the whole drive. But how do I tell zpool which physical piece of hardware I'm referring to, in a way that will still be consistent in 5-10 years' time, after any number of reboots, relocations, motherboard replacements etc.?
So, is the current wisdom still "wrap your ZFS vdevs in GPT partitions", or is there another alternative that I've missed?
Thanks in advance,
Jauh
PS: Here are some of the useful commands I found to help debug my situation:
- Obviously, /var/log/messages is the first place to look, e.g.
grep -i '\(boot\|geom\|da[0-9]*:\|usb\)' /var/log/messages
- gpart status and glabel status to see which disk is mounted where (only works with GPT-formatted disks).
- geom -t also gives a nice overview of your storage devices.
- camcontrol identify <device> for reading detailed information such as serial number, make and model, as well as several SMART parameters.
- usbconfig list to see which USB devices are available. (Sidenote: sometimes, after rebooting, mine only get HIGH speed (480 Mbps) instead of SUPER speed (5.0 Gbps); worth keeping an eye on.)
- diskinfo -v <device> and diskinfo -s <device> also show information about the disks, like their serial number, but they also pick up the 000000123DE9 IDs fudged by the Delock 42617 cases; in other words, don't rely on this!
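One more snippet I ended up with: a quick loop (FreeBSD-only, and the /dev/daN naming is an assumption about your setup) that prints each da disk together with whatever serial camcontrol reports, so you can compare them against the stickers:

```shell
# Sketch: map each da(4) disk to the serial line camcontrol reports.
# Assumes disks appear as /dev/daN; partition nodes like da0p1 are skipped.
for dev in /dev/da[0-9]*; do
    case "$dev" in
        (*p[0-9]*) continue ;;    # skip GPT partition entries
    esac
    printf '%s: ' "${dev#/dev/}"
    camcontrol identify "$dev" | grep -i 'serial number'
done
```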