ZFS and Hotswap Indicator LEDs?

CoryG · Jun 10, 2021

This may be a dumb question, but when a hotswap drive in a server fails under ZFS does the LED indicating a drive failure come on for that drive?
Another question: can hotswap drives simply be replaced under ZFS to have it handle replicating the data, or are manual actions required within the console to resolve things after replacing a failed drive?
I've never used ZFS before, but after seeing performance benchmarks vs hardware RAID controllers for database R/W times I'm interested in using it - though those two items would be critical for me.

SirDice · Jun 10, 2021

CoryG said:
but when a hotswap drive in a server fails under ZFS does the LED indicating a drive failure come on for that drive?

That's the controller doing it. It also detected the drive has errors.

CoryG said:
Another question: can hotswap drives simply be replaced under ZFS to have it handle replicating the data, or are manual actions required within the console to resolve things after replacing a failed drive?

You will need to issue the correct zpool replace command to tell ZFS which drive to replace. It doesn't "assume" that a new drive inserted into the system is a replacement for the broken one that was removed.

That said, you can use zfsd(8) and have it automatically replace disks with hot spares for example.
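As a rough illustration of the manual replace step described above, here is a minimal Python sketch that builds and optionally runs a `zpool replace` command. The pool and device names (`tank`, `da3`, `da7`) and the helper names are made up for the example; only the `zpool replace pool old_device [new_device]` form itself is assumed from zpool(8).

```python
import subprocess


def zpool_replace_cmd(pool, old_dev, new_dev=None):
    """Build the argv for `zpool replace pool old_dev [new_dev]`.

    new_dev may be omitted when the new disk sits in the same
    location (same device name) as the one it replaces.
    """
    cmd = ["zpool", "replace", pool, old_dev]
    if new_dev is not None:
        cmd.append(new_dev)
    return cmd


def replace_disk(pool, old_dev, new_dev=None, dry_run=True):
    """Return the command line; only execute it when dry_run is False."""
    cmd = zpool_replace_cmd(pool, old_dev, new_dev)
    if not dry_run:
        # Needs root and a real pool; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


# Hypothetical names: pool "tank", failed disk da3, new disk da7.
print(replace_disk("tank", "da3", "da7"))
```

The dry-run default is a deliberate choice here: replacing the wrong device is painful, so it seems safer to print the command first and run it by hand.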

CoryG · Jun 10, 2021

So for that first one - does that mean the LEDs would still function because that's wired into the drives themselves, or that ZFS wouldn't be able to give a physical indication (e.g. just looking at a rack full of machines, not opening consoles or setting up alerts) that a drive failed?

Deleted member 67440 · Jun 10, 2021

ZFS does not give you physical indication, neither does Windows, nor Linux (AFAIK).
It is the controller.
With ZFS the whole computer becomes a "giant RAID controller" with tens/hundreds of GB of RAM, tens of cores, etc.
After a "zpool replace" the automatic resilvering (fixing) procedure starts almost immediately.

Recording the exact location of physical disks is something pretty important with zfs, especially if you have tens or hundreds of them.
There are various ways to "name" the disks and to get an indication of their physical location, but (AFAIK) no blinking LED.
It is a question that has been debated for years among those who deal with these machines.

Essentially the advanced functions of the RAID controllers must be completely disabled before using zfs.
Even "normal" configurations like JBOD should be avoided.

The "dumber" the disk connections, the better.

covacat · Jun 10, 2021

zfs knows shit about underlying hardware
you can probably create a pool with a floppy disk a sd card and a ramdisk

if you use a raid controller in jbod mode it will probably take care of the leds

Deleted member 67440 · Jun 10, 2021

performance benchmarks vs hardware RAID controllers for database R/W times

In fact no, it is not the best thing to do.

For a database, the gain is not so much in performance (which can be higher, typically thanks to data compression more than anything else) as in data safety.

In particular, not needing an fsck / scandisk (without deduplication) is one of the most significant elements.

It is almost (almost because anything can happen) impossible to "break" a zfs pool (not deduplicated) with logical damage, such as sudden shutdowns, freezes, SQL daemons that crash etc.

Sure they can always happen, but it's a different world than NTFS and even ext4

SirDice · Jun 10, 2021

CoryG said:
does that mean the LEDs would still function because that's wired into the drives themselves

The LEDs are wired to the controller, not the drive. Some controllers and enclosures have additional (usually) blue identity or locator LEDs that are extremely useful. But you need to turn those on yourself.

CoryG · Jun 10, 2021

So as a programmer with basically no deep driver experience outside of hardware I've been involved in designing, and in turn no familiarity with how RAID controllers are engineered in that regard, my question is: is there a standard way for ZFS to interface with a RAID controller in a JBOD configuration, so that when ZFS detects a bad drive it can tell the controller to light up the error light on the HDD caddy? My understanding (which could be wrong) is that a RAID controller generally detects bad blocks through r/w failures, parity mismatches, etc., and the parity part would definitely fall in the ZFS realm if it's acting as the RAID controller. Is this functionality that any RAID controllers are known to have, or that ZFS already has interfaces for, or is it just something they never designed for, with no way to hack it into a driver?

SirDice · Jun 10, 2021

When you talk about ZFS you generally don't use RAID controllers but HBAs. Or at the very least use a controller with all the RAID functionality turned off (JBOD). The controller will only detect certain issues with the drive, like command time outs, bus errors, that sort of thing. If it gets enough "noise" from a drive it'll decide it's best to kick that device off the bus. ZFS detects the drive has fallen off and will mark it as "missing" or "offline" and act accordingly. It's not ZFS that signals the controller to disable a disk, ZFS detects the controller has kicked a drive off.
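To make the "ZFS marks it and acts accordingly" part a bit more concrete, here is a small Python sketch that scans `zpool status` text for devices in a bad state. The sample output is hand-written for illustration (not captured from a real system), and the set of states treated as "bad" is an assumption; a pool or vdev that is itself in one of those states would be reported too.

```python
# States treated as "needs attention" in this sketch (an assumption,
# not an exhaustive list of what zpool status can print).
BAD_STATES = {"FAULTED", "UNAVAIL", "REMOVED", "OFFLINE"}


def unhealthy_devices(status_text):
    """Return (name, state) pairs from the config table of `zpool status`."""
    found = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line ends the config table.
            in_table = False
            continue
        if fields[0] == "NAME" and len(fields) > 1 and fields[1] == "STATE":
            in_table = True
            continue
        if in_table and len(fields) >= 2 and fields[1] in BAD_STATES:
            found.append((fields[0], fields[1]))
    return found


# Hand-written sample resembling `zpool status` for a degraded mirror.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

\tNAME        STATE     READ WRITE CKSUM
\ttank        DEGRADED     0     0     0
\t  mirror-0  DEGRADED     0     0     0
\t    da0     ONLINE       0     0     0
\t    da1     REMOVED      0     0     0

errors: No known data errors
"""

print(unhealthy_devices(SAMPLE))  # [('da1', 'REMOVED')]
```

On a real machine the text would come from running `zpool status` (for example via `subprocess.run(["zpool", "status"], capture_output=True, text=True)`), and the result could feed whatever alerting you already have.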

covacat · Jun 10, 2021

look at sesutil(8) and zfsd(8)
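For anyone following the sesutil(8) pointer, a minimal Python sketch of switching a slot's locate or fault LED might look like this. It assumes the `sesutil locate devname on|off` and `sesutil fault devname on|off` forms from the man page and an enclosure that actually supports SES; the helper names and `da5` are placeholders.

```python
import subprocess


def sesutil_led_cmd(kind, devname, on=True):
    """Build the argv for `sesutil locate|fault devname on|off`."""
    if kind not in ("locate", "fault"):
        raise ValueError("kind must be 'locate' or 'fault'")
    return ["sesutil", kind, devname, "on" if on else "off"]


def set_led(kind, devname, on=True, dry_run=True):
    """Return the command line; only execute it when dry_run is False."""
    cmd = sesutil_led_cmd(kind, devname, on)
    if not dry_run:
        # Needs root and a /dev/ses... device behind the disk.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


# Placeholder device name: blink the locator LED for da5.
print(set_led("locate", "da5", on=True))
```

Note that this only works while the disk still has a device name; once a drive has dropped off the bus you would have to address the slot some other way.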

Alain De Vos · Jun 10, 2021

For the meaning of the LEDs you must look in the manual for the specific hardware.
When one of the LEDs behaves differently from the others, there is normally a procedure to follow.

ralphbsz · Jun 10, 2021

Simple answer: No, ZFS will not control replacement LEDs. Not without additional software, which AFAIK is not standardized, and no pre-cooked package for it exists. As a matter of fact, a general solution is nearly impossible. Writing a specific solution for specific hardware is not terribly hard; with a few hours of scripting, it can be done by a competent sys admin.

Complex answer: That's because ZFS does not know that LEDs exist. And even if it went and knew about mechanisms to control LEDs, it would not know which LED corresponds to which disk. And this is where a RAID controller is capable of doing better.

Explanation: Say we have a system with a disk enclosure, which contains multiple disks, typically hot-swappable (which often implies that the enclosure can turn power for the disk off and on), and that has multiple LEDs per disk. Typically, there will be up to three LEDs in the enclosure per drive slot: (a) A power LED. That one does not require any intelligence to control, it is directly wired to the power pins that feed the disk, perhaps in conjunction with a simple transistor that detects whether the disk is present. (b) A disk activity LED; that one is driven electrically directly from the disk drive. (c) An error or replace LED. This one is driven by the enclosure controller. If you think about it, it's obvious that it has to be controlled by the enclosure controller, not by the disk: a disk may have to be replaced because it is dead; or the admin may have to insert a disk into an empty slot, so one can not rely on the disk drive controlling the error/replace LED.

And the way that last LED is controlled is the problem. From a data path point of view, there are separate devices involved. The disk speaks to the computer over the SATA protocol, or over a SCSI block protocol; this is what causes the OS to create many devices with names like /dev/da... and /dev/ada..., which are then used by the file system (for example ZFS). The enclosure controller speaks to the computer over the SCSI SES (enclosure services) protocol, which causes a single device /dev/ses... to be created for the whole enclosure.

To turn error/replace LEDs off and on requires implementing the following logic:

1. Make a list of all the disks that ZFS needs, translating their identity to block device names like /dev/ada... or /dev/da... that can be opened, but also storing the "serial number" (called WWN) of the disk drive.
2. Persistently remember that list (you'll see below why we need to store it permanently).
3. Find all enclosures that are connected, with names like /dev/ses...
4. Ask each enclosure how many disk drive slots it has.
5. For each slot in each enclosure, ask the enclosure whether that slot is occupied, and if yes, which WWN is in that slot.
6. Store that mapping persistently as well.
7. Ask each enclosure whether it has a controllable LED for each slot.

Now, if a drive fails, we can do the following: Ask our stored list which maps all the disks to slots where the disk is, or where it was last seen (if it is currently not communicating). If we know it, and that slot has a controllable LED, turn that LED on. There is an enormous number of special cases that need to be considered to make this work, and implementing it correctly takes literally person-years of engineering. On the other hand, a quick hack script that mostly works for specific hardware can be put together quickly by a person who is experienced in scripting and knows the SCSI protocols.
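The bookkeeping part of that logic (remember where each WWN was last seen, then look it up when the disk fails) can be sketched in a few lines of Python. This is only the "quick hack" end of the spectrum: the JSON file layout, the field names, and the sample WWN are all invented for the example, and actually querying the enclosure (for instance from `sesutil map` output) and lighting the LED are left out.

```python
import json
from pathlib import Path


def remember_slots(path, observed):
    """Merge freshly observed WWN -> slot info into the persistent map.

    observed is a dict like:
      {wwn: {"enclosure": "/dev/ses0", "slot": 4, "has_fault_led": True}}
    How that dict is gathered from the enclosure is hardware specific
    and not shown here.
    """
    path = Path(path)
    known = json.loads(path.read_text()) if path.exists() else {}
    known.update(observed)
    path.write_text(json.dumps(known, indent=2, sort_keys=True))
    return known


def slot_for_failed_disk(path, wwn):
    """Return (enclosure, slot) where the disk was last seen.

    Returns None if the disk was never recorded or the slot has no
    controllable fault LED.
    """
    path = Path(path)
    if not path.exists():
        return None
    entry = json.loads(path.read_text()).get(wwn)
    if entry is None or not entry.get("has_fault_led"):
        return None
    return entry["enclosure"], entry["slot"]
```

The map has to live on disk because the lookup happens exactly when the failed drive can no longer be asked who or where it is. Everything that makes the real problem hard (moved disks, multipath, several enclosures, stale entries) is ignored here.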

The problem with ZFS doing all this is: ZFS is a general purpose file system. It works on all manner of hardware, including disks that are not in enclosure slots (for example I use ZFS on my home server with 4 disks, none of which have LEDs, none of which are in an enclosure, and there are no /dev/ses... devices). Implementing all that logic correctly would be a heck of a lot of work, and it would only benefit a small number of users; and those users already have their own methods of dealing with it (write their own scripts, do not replace disk drives, use manual methods to track disk locations, or spend money on hardware/software/services to take care of it). Most importantly, ZFS does not make any money (it is not sold as a product), so there is no funding for implementing all this, unless someone donates or volunteers.

For a RAID controller, a lot of this is easier to do. RAID controllers already live in the world of SCSI protocols, so they know how to talk to disk drives and enclosures (someone needs to). They already need to store information about disks (namely the identity of the disk), but unlike ZFS, they can specialize that to storing the WWN. And they have a revenue stream (namely selling hardware) that can be used to implement these features. And they have a revenue stream because their users want these features.

Just to be clear: I am heavily in favor of using ZFS as the RAID layer, and not using hardware RAID.

Alain De Vos · Jun 10, 2021

I worked with professional SAN/NAS solutions, and it was the hardware builder/vendor/integrator who described the purpose of the LEDs. They were mostly under control of the disk enclosure, and ZFS had no influence as far as I remember, although it did tell me which disks/LEDs I should check.
It worked with an underlying alarm system: some software which collected info from the controllers and filesystems and raised an alarm when a disk failed.
I had to press a few buttons on the enclosure and a few buttons on the failing disk, then just pull the failing disk out, put the new one in, and press a few more buttons. Then I went to the software interface to start the ZFS resilvering of the RAID.

ralphbsz · Jun 10, 2021

Exactly: That builder/vendor/integrator did the connecting of LEDs (and in your case switches) to the ZFS file system. That is a lot of work to implement, BTDT.
